Sentence completion


Expect the unexpected: Harnessing Sentence Completion for Sarcasm Detection

Joshi, Aditya, Agrawal, Samarth, Bhattacharyya, Pushpak, Carman, Mark

arXiv.org Artificial Intelligence

The trigram `I love being' is expected to be followed by positive words such as `happy'. In a sarcastic sentence, however, the word `ignored' may be observed. The expected and the observed words are, thus, incongruous. We model sarcasm detection as the task of detecting incongruity between an observed and an expected word. In order to obtain the expected word, we use Context2Vec, a sentence completion library based on Bidirectional LSTM. However, since the exact word where such an incongruity occurs may not be known in advance, we present two approaches: an All-words approach (which consults sentence completion for every content word) and an Incongruous words-only approach (which consults sentence completion for the 50% most incongruous content words). The approaches outperform reported values for tweets but not for discussion forum posts. This is likely because sentence completion is consulted redundantly for discussion forum posts. Therefore, we consider an oracle case where the exact incongruous word is manually labeled in a corpus reported in past work. In this case, the performance is higher than that of the All-words approach. This points to the promise of using sentence completion for sarcasm detection.
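The core idea above can be sketched in a few lines: obtain the expected filler word from a sentence-completion model, then compare it with the observed word in embedding space, treating low similarity as incongruity. The mini "model" below is a hand-made lookup table and the word vectors are invented for illustration; the paper itself uses Context2Vec and real embeddings.

```python
from math import sqrt

# Toy stand-in for a sentence-completion model such as Context2Vec:
# maps a context (with an implicit blank at the end) to its most
# likely filler word. Contexts and fillers are invented.
EXPECTED = {
    ("i", "love", "being"): "happy",
}

# Tiny hand-made word vectors standing in for real embeddings.
VECTORS = {
    "happy":   (0.9, 0.8),
    "ignored": (-0.8, 0.7),
    "joyful":  (0.85, 0.75),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def incongruity_score(context, observed):
    """Similarity between the expected and the observed word.
    A low value signals incongruity, the cue used for sarcasm."""
    expected = EXPECTED[context]
    return cosine(VECTORS[expected], VECTORS[observed])

# "I love being ignored" -> low similarity -> likely sarcastic
sarcastic = incongruity_score(("i", "love", "being"), "ignored")
# "I love being joyful"  -> high similarity -> likely literal
literal = incongruity_score(("i", "love", "being"), "joyful")
print(sarcastic < 0.5 < literal)
```

In the full approaches, this comparison is repeated over every content word (All-words) or only over the most incongruous half of them.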


Evaluating Gender Bias in Large Language Models

Döll, Michael, Döhring, Markus, Müller, Andreas

arXiv.org Artificial Intelligence

Gender bias in artificial intelligence has become an important issue, particularly in the context of language models used in communication-oriented applications. This study examines the extent to which Large Language Models (LLMs) exhibit gender bias in pronoun selection in occupational contexts. The analysis evaluates the models GPT-4, GPT-4o, PaLM 2 Text Bison and Gemini 1.0 Pro using a self-generated dataset. The jobs considered include a range of occupations, from those with a significant male presence to those with a notable female concentration, as well as jobs with a relatively equal gender distribution. Three different sentence processing methods were used to assess potential gender bias: masked tokens, unmasked sentences, and sentence completion. In addition, the LLMs suggested names of individuals in specific occupations, which were then examined for gender distribution. The results show a positive correlation between the models' pronoun choices and the gender distribution present in U.S. labor force data. Female pronouns were more often associated with female-dominated occupations, while male pronouns were more often associated with male-dominated occupations. Sentence completion showed the strongest correlation with actual gender distribution, while name generation resulted in a more balanced 'politically correct' gender distribution, albeit with notable variations in predominantly male or female occupations. Overall, the prompting method had a greater impact on gender distribution than the model selection itself, highlighting the complexity of addressing gender bias in LLMs. The findings highlight the importance of prompting in gender mapping.
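The headline result, a positive correlation between a model's pronoun choices and labor-force gender shares, can be illustrated with a plain Pearson correlation. All numbers below are hypothetical placeholders, not figures from the study.

```python
from statistics import mean

# Invented illustrative numbers: share of female pronouns a model
# produced per occupation, vs. the female share in labor statistics.
occupations        = ["nurse", "teacher", "engineer", "pilot"]
model_female_share = [0.92, 0.75, 0.20, 0.10]   # hypothetical model output
labor_female_share = [0.88, 0.73, 0.16, 0.08]   # hypothetical labor data

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

r = pearson(model_female_share, labor_female_share)
print(round(r, 3))  # close to 1: pronoun choice tracks labor-force data
```

A value of `r` near 1 corresponds to the study's finding that female pronouns concentrate in female-dominated occupations and vice versa.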


QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities

Sosto, Mae, Barrón-Cedeño, Alberto

arXiv.org Artificial Intelligence

With the increasing role of Natural Language Processing (NLP) in various applications, challenges concerning bias and stereotype perpetuation are accentuated, which often leads to hate speech and harm. Despite existing studies on sexism and misogyny, issues like homophobia and transphobia remain underexplored and often adopt binary perspectives, putting the safety of LGBTQIA+ individuals at high risk in online spaces. In this paper, we assess the potential harm caused by sentence completions generated by English large language models (LLMs) concerning LGBTQIA+ individuals. This is achieved using QueerBench, our new assessment framework, which employs a template-based approach and a Masked Language Modeling (MLM) task. The analysis indicates that large language models tend to exhibit discriminatory behaviour more frequently towards individuals within the LGBTQIA+ community, reaching a difference gap of 7.2% in the QueerBench score of harmfulness.
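The template-plus-MLM setup can be sketched as follows: fill a masked template with an identity term, collect a masked-language model's completions, and score them against a harm lexicon. Everything here is a placeholder (the template wording, the mocked predictions, the lexicon); QueerBench's actual templates, models, and scoring are defined in the paper.

```python
# Hypothetical sketch of a template-based probe in the spirit of
# QueerBench. All resources below are invented for illustration.
TEMPLATE = "The {subject} person is [MASK]."

def mock_mlm_completions(prompt):
    # Stand-in for a real MLM's top-k predictions for [MASK].
    return ["kind", "strange", "brave"]

HARMFUL = {"strange", "dangerous"}  # toy harm lexicon

def harm_rate(subject):
    """Fraction of MLM completions for this subject flagged as harmful."""
    prompt = TEMPLATE.format(subject=subject)
    preds = mock_mlm_completions(prompt)
    return sum(p in HARMFUL for p in preds) / len(preds)

print(harm_rate("queer"))
```

Comparing such rates across identity terms is what yields a gap like the 7.2% difference the paper reports.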


LLMs4OL: Large Language Models for Ontology Learning

Giglou, Hamed Babaei, D'Souza, Jennifer, Auer, Sören

arXiv.org Artificial Intelligence

We propose the LLMs4OL approach, which utilizes Large Language Models (LLMs) for Ontology Learning (OL). LLMs have shown significant advancements in natural language processing, demonstrating their ability to capture complex language patterns in different knowledge domains. Our LLMs4OL paradigm investigates the following hypothesis: \textit{Can LLMs effectively apply their language pattern capturing capability to OL, which involves automatically extracting and structuring knowledge from natural language text?} To test this hypothesis, we conduct a comprehensive evaluation using the zero-shot prompting method. We evaluate nine different LLM model families for three main OL tasks: term typing, taxonomy discovery, and extraction of non-taxonomic relations. Additionally, the evaluations encompass diverse genres of ontological knowledge, including lexicosemantic knowledge in WordNet, geographical knowledge in GeoNames, and medical knowledge in UMLS.
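Zero-shot prompting for the three OL tasks amounts to instantiating a task-specific template per input. The template wording below is invented for illustration and is not the paper's actual prompt set.

```python
# Illustrative zero-shot prompt templates for the three OL tasks;
# the phrasing is hypothetical, not taken from LLMs4OL.
PROMPTS = {
    "term_typing": "What is the generalized type of '{term}'?",
    "taxonomy_discovery": "Is '{child}' a subclass of '{parent}'? Answer yes or no.",
    "nontaxonomic_relations": "What relation holds between '{head}' and '{tail}'?",
}

def build_prompt(task, **slots):
    """Fill the template for one OL task with the given slot values."""
    return PROMPTS[task].format(**slots)

print(build_prompt("term_typing", term="aspirin"))
```

The resulting strings would be sent to each LLM family and the raw responses scored against the gold ontology labels.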


SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners

Liu, Qiongqiong, Huang, Yaying, Liu, Zitao, Huang, Shuyan, Chen, Jiahao, Zhao, Xiangyu, Lin, Guimin, Zhou, Yuyu, Luo, Weiqi

arXiv.org Artificial Intelligence

Sentence completion (SC) questions present a sentence with one or more blanks to be filled in, along with three to five candidate words or phrases as options. SC questions are widely used for students learning English as a Second Language (ESL). In this paper, we present a large-scale SC dataset, \textsc{SC-Ques}, which is made up of 289,148 ESL SC questions from real-world standardized English examinations. Furthermore, we build a comprehensive benchmark for automatically solving the SC questions by training large-scale pre-trained language models on the proposed \textsc{SC-Ques} dataset. We conduct a detailed analysis of the baseline models' performance, limitations and trade-offs. The data and our code are available for research purposes from: \url{https://github.com/ai4ed/SC-Ques}.
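An SC item of the kind described can be represented as a stem with blanks plus a small option list. The field names and blank convention below are illustrative, not the dataset's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical representation of one SC-Ques-style item.
@dataclass
class SCQuestion:
    stem: str                       # sentence with "___" blanks
    options: list = field(default_factory=list)  # 3-5 candidate fillers
    answer: int = 0                 # index of the gold option

def fill(stem, option):
    """Replace blanks left-to-right with the candidate's parts
    (multi-blank options are written as 'part / part')."""
    filled = stem
    for part in option.split(" / "):
        filled = filled.replace("___", part, 1)
    return filled

q = SCQuestion(
    stem="She ___ to school every day.",
    options=["go", "goes", "going"],
    answer=1,
)
print(fill(q.stem, q.options[q.answer]))
```

A pre-trained language model solver would score each of the filled sentences and pick the most probable one.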


ChatGPT: Handle with care and don't be fooled into thinking it's human

#artificialintelligence

People yelling at broken computers was a popular YouTube genre in the 2000s. If you think this is unique to non-digital boomers when it comes to new technologies, think again. Something similar is happening today with ChatGPT, the chatbot recently launched by OpenAI, which has already generated hype not seen since the Metaverse was considered the next big thing (seems like eons ago, doesn't it?). We are not yelling at ChatGPT, but we interact with it as if it were a person, sometimes even accusing it of lying. "And, in a way, it's natural," says Heather Yang, an Assistant Professor at Bocconi's Department of Management and Technology, whose research focuses on how people interact with novel technologies and how that is changing our workplace environment.


Effidit: Your AI Writing Assistant

Shi, Shuming, Zhao, Enbo, Tang, Duyu, Wang, Yan, Li, Piji, Bi, Wei, Jiang, Haiyun, Huang, Guoping, Cui, Leyang, Huang, Xinting, Zhou, Cong, Dai, Yong, Ma, Dongyang

arXiv.org Artificial Intelligence

In this technical report, we introduce Effidit (Efficient and Intelligent Editing), a digital writing assistant that helps users write higher-quality text more efficiently by using artificial intelligence (AI) technologies. Previous writing assistants typically provide the function of error checking (to detect and correct spelling and grammatical errors) and limited text-rewriting functionality. With the emergence of large-scale neural language models, some systems support automatically completing a sentence or a paragraph. In Effidit, we significantly expand the capabilities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME). In the text completion category, Effidit supports generation-based sentence completion, retrieval-based sentence completion, and phrase completion. In contrast, many other writing assistants so far only provide one or two of the three functions. For text polishing, we have three functions: (context-aware) phrase polishing, sentence paraphrasing, and sentence expansion, whereas many other writing assistants often support one or two functions in this category. The main contents of this report include major modules of Effidit, methods for implementing these modules, and evaluation results of some key methods.


NLP Article #1 : A word is worth a 1000 pictures

#artificialintelligence

Natural Language Processing is quite simply the study and use of machines to intelligently use, and create, natural language. I purposely leave out the word 'understand' for now, as it is a bit of a prickly subject when using probabilistic models. But the conundrum is the following: In terms of bit-rate, verbal communication is awful. A lecturer might utter 300 words per minute, which is the paltry rate of about 70 bytes/second. But switch to a two-way conversation, and that can drop even further to 100 words/minute, or a definitely anaemic 25 bytes/second.